HiFun - A High Level Functional Query Language for Big Data Analytics

Authors

  • Nicolas Spyratos
  • Tsuyoshi Sugibuchi
Abstract

We present a high-level query language, called HiFun, for defining analytic queries over big data sets, independently of how these queries are evaluated. An analytic query in HiFun is defined to be a well-formed expression of a functional algebra that we define in the paper. The operations of this algebra combine functions to create HiFun queries in much the same way as the operations of the relational algebra combine relations to create relational algebra queries. The contributions of this paper are as follows: (a) defining a formal framework (i.e., HiFun) in which to study analytic queries in the abstract, in much the same way as the relational algebra provides the formal framework to study relational algebra queries; (b) showing that each HiFun query can be encoded as a map-reduce job, and also as a SQL group-by query when the data set is a relational database; and (c) defining a formal method for rewriting HiFun queries and, as a case study, showing how our method can be applied in the rewriting of map-reduce jobs and of SQL group-by queries. We emphasize that, although theoretical in nature, our work uses only basic and well-known mathematical concepts, namely functions and their basic operations.
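The abstract does not give HiFun's syntax, so the following is only an illustrative sketch of contribution (b), not the authors' notation. It assumes an analytic query can be modelled as a triple of a grouping function, a measuring function, and a reduction operation; the data set, the field layout, and the names `evaluate` and `evaluate_sql` are all hypothetical. The same query is evaluated once in map-reduce style and once as a SQL group-by query.

```python
from collections import defaultdict
from functools import reduce
import operator
import sqlite3

# Hypothetical toy data set: (invoice_id, branch, quantity).
DELIVERIES = [
    (1, "north", 200),
    (2, "north", 100),
    (3, "south", 200),
    (4, "south", 400),
    (5, "east", 100),
]


def evaluate(data, grouping, measuring, op):
    """Evaluate an assumed (grouping, measuring, op) query in map-reduce style."""
    # Map phase: emit one (group key, measure) pair per data item.
    pairs = [(grouping(item), measuring(item)) for item in data]
    # Shuffle phase: collect the measures that share a group key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    # Reduce phase: fold each group's measures with the reduction operation.
    return {key: reduce(op, values) for key, values in groups.items()}


def evaluate_sql(data):
    """Evaluate the same query as a SQL group-by over an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE deliveries (invoice INTEGER, branch TEXT, quantity INTEGER)"
    )
    conn.executemany("INSERT INTO deliveries VALUES (?, ?, ?)", data)
    rows = conn.execute(
        "SELECT branch, SUM(quantity) FROM deliveries GROUP BY branch"
    ).fetchall()
    conn.close()
    return dict(rows)


# "Total quantity per branch": group by branch, measure quantity, reduce by sum.
totals = evaluate(
    DELIVERIES,
    grouping=lambda d: d[1],
    measuring=lambda d: d[2],
    op=operator.add,
)
```

Under these assumptions, `totals` and `evaluate_sql(DELIVERIES)` hold the same per-branch sums, which is the kind of equivalence between the two encodings that the abstract refers to.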

Similar resources

Trill: A High-Performance Incremental Query Processor for Diverse Analytics

This paper introduces Trill – a new query processor for analytics. Trill fulfills a combination of three requirements for a query processor to serve the diverse big data analytics space: (1) Query Model: Trill is based on a tempo-relational model that enables it to handle streaming and relational queries with early results, across the latency spectrum from real-time to offline; (2) Fabric and L...

Trill: Engineering a Library for Diverse Analytics

Trill is a streaming query processor that fulfills three requirements to serve the diverse big data analytics space: (1) Query Model: Trill is based on the tempo-relational model that enables it to handle streaming and relational queries with early results, across the latency spectrum from real-time to offline; (2) Fabric and Language Integration: Trill is architected as a high-level language l...

Big Data Analytics and Now-casting: A Comprehensive Model for Eventuality of Forecasting and Predictive Policies of Policy-making Institutions

The ability of now-casting and eventuality is the most crucial and vital achievement of big data analytics in the area of policy-making. To recognize the trends and to render a real image of the current condition and alarming immediate indicators, the significance and the specific positions of big data in policy-making are undeniable. Moreover, the requirement for policy-making institutions to ...

A Fuzzy TOPSIS Approach for Big Data Analytics Platform Selection

Big data sizes are constantly increasing. Big data analytics is where advanced analytic techniques are applied on big data sets. Analytics based on large data samples reveals and leverages business change. The popularity of big data analytics platforms, which are often available as open-source, has not remained unnoticed by big companies. Google uses MapReduce for PageRank and inverted indexes....

SnappyData: Streaming, Transactions, and Interactive Analytics in a Unified Engine

In recent years, our customers have expressed frustration in the traditional approach of using a combination of disparate products to handle their streaming, transactional and analytical needs. The common practice of stitching heterogeneous environments in custom ways has caused enormous production woes by increasing development complexity and total cost of ownership. With SnappyData, an open s...

Publication date: 2017